-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#5889] feat(client-python): Add model management Python API #6009
base: main
Are you sure you want to change the base?
Conversation
resp = self.rest_client.get( | ||
self._format_model_request_path(model_full_ns), | ||
error_handler=MODEL_ERROR_HANDLER, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds a little bit dangerous if the GET doesn't work as expected, or the return is not application/json
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think such kind of exceptions/errors is handled by rest_client, we don't need to handle it here in this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
""" | ||
|
||
@abstractmethod | ||
def version(self) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we want to use string as version number for generic scenarios.
If this means an internal representation, maybe we want to call it something like revision
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current design, version is a self-incremented integer version number, in the meantime, we also have "aliases" for the convenience of use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
clients/client-python/gravitino/client/generic_model_catalog.py
Outdated
Show resolved
Hide resolved
clients/client-python/gravitino/client/generic_model_catalog.py
Outdated
Show resolved
Hide resolved
except NoSuchModelVersionException: | ||
return False | ||
|
||
def get_model_version_by_alias( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_model_alias
for consistent method naming style?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My feeling is that the name may be misleading, users may feel that this method is to get the alias, not the model version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was comparing this to the get_model_version
method, which can be interpreted as "getting a model by its specific version".
If that is true, we can name this to get_model_alias
.
After revisiting this collection of methods, I see what this method was designed for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me clarify the concept of model management in Gravitino. "Model" is a concept mapping to an ML model, "Model Version" is a snapshot of the specific "Model", one "Model" can have multiple "Model Versions". Besides, one "Model Version" can have multiple aliases, so users can use either "version" or "alias" to get a specific "Model Version".
So here, get_model_version
means get a specific model version entity by version number, and get_model_version
means get a specific model version entity by alias. It is not getting a model, it is getting a model version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Now I think I get the intention behind the "model version" thing and I know where all these come from.
TBH, the concept of "model version" as an entity was not so straightforward to me,
because "version" in my mind was not supposed to be treated as first-class entities.
A "version" has to be associated with a model, it is not supposed to managed separately. The "version" value can be associated with more than one model,
if we are using this term to represent a semantic version.
I did get that the "version" in this context has nothing to do with semantic
version, it is actually a sequence number, used to form a unique key for an entity.
Now we have a dilemma. We still need an attribute for "version" in the traditional
sense. So we invented a new term "alias", which further complicates the conceptual
design as a whole. Can/shall we promote "alias" to be a version string?
self.link_model_version(ident, uri, aliases, comment, properties) | ||
return model | ||
|
||
@staticmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer moving all static methods out of the class considering that they are simply utility functions.
_name: str = field(metadata=config(field_name="name")) | ||
_comment: Optional[str] = field(metadata=config(field_name="comment")) | ||
_properties: Optional[Dict[str, str]] = field( | ||
metadata=config(field_name="properties") | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These class properties seem useless?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're mainly used for json serialization.
clients/client-python/gravitino/dto/responses/model_response.py
Outdated
Show resolved
Hide resolved
@@ -34,13 +33,13 @@ def __init__(self, levels: List[str]): | |||
self._levels = levels | |||
|
|||
def to_json(self): | |||
return json.dumps(self._levels) | |||
return self._levels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we do this changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bug in the previous code, it treats a json list as a string, which will lead to failure in deserialization.
Raises: | ||
IllegalArgumentException if the request is invalid | ||
""" | ||
if not self._uri: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember we have checked the validness of alias, it seems that the same logic is missing.
What changes were proposed in this pull request?
This PR proposes to add Python client API for model management.
Why are the changes needed?
This is part of work to support model management in Gravitino.
Fix: #5889
Does this PR introduce any user-facing change?
No.
How was this patch tested?
UT added.